stability-plasticity dilemma
Are Time Series Foundation Models Susceptible to Catastrophic Forgetting?
Karaouli, Nouha, Coquenet, Denis, Fromont, Elisa, Mermillod, Martial, Reyboz, Marina
Time Series Foundation Models (TSFMs) have shown promising zero-shot generalization across diverse forecasting tasks. However, their robustness to continual adaptation remains underexplored. In this work, we investigate the extent to which TSFMs suffer from catastrophic forgetting when fine-tuned sequentially on multiple datasets. Using synthetic datasets designed with varying degrees of periodic structure, we measure the trade-off between adaptation to new data and retention of prior knowledge. Our experiments reveal that, while fine-tuning improves performance on new tasks, it often causes significant degradation on previously learned ones, illustrating a fundamental stability-plasticity dilemma.
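The adaptation-versus-retention trade-off described above can be quantified with a simple forgetting score. The sketch below is not the authors' evaluation code: the error matrix layout, the function name `forgetting_per_dataset`, and the numbers are all illustrative assumptions.

```python
import numpy as np

def forgetting_per_dataset(err):
    """Increase in error on each earlier dataset by the end of sequential fine-tuning.

    err[i, j] is assumed to hold the forecast error on dataset j measured
    right after the model has been fine-tuned on datasets 0..i in order.
    """
    err = np.asarray(err, dtype=float)
    just_learned = np.diag(err)   # error on dataset j right after fine-tuning on it
    final = err[-1]               # error on every dataset after the last stage
    # The last dataset cannot have been forgotten yet, so it is excluded.
    return (final - just_learned)[:-1]

# Hypothetical errors for three datasets learned one after another.
err = np.array([
    [0.10, 0.60, 0.70],
    [0.25, 0.12, 0.65],
    [0.40, 0.30, 0.11],
])
forgetting = forgetting_per_dataset(err)
```

A positive entry means the error on that dataset grew after later fine-tuning stages (forgetting), while the small diagonal values reflect successful adaptation to each new dataset.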
Escaping Stability-Plasticity Dilemma in Online Continual Learning for Motion Forecasting via Synergetic Memory Rehearsal
Lin, Yunlong, Lu, Chao, Wu, Tongshuai, Zhao, Xiaocong, Du, Guodong, Sun, Yanwei, Li, Zirui, Gong, Jianwei
Deep neural networks (DNNs) have achieved remarkable success in motion forecasting. However, most DNN-based methods suffer from catastrophic forgetting and fail to maintain their performance in previously learned scenarios after adapting to new data. Recent continual learning (CL) studies aim to mitigate this phenomenon by enhancing the memory stability of DNNs, i.e., the ability to retain learned knowledge. Yet, excessive emphasis on memory stability often impairs learning plasticity, i.e., the capacity of DNNs to acquire new information effectively. To address this stability-plasticity dilemma, this study proposes a novel CL method, synergetic memory rehearsal (SyReM), for DNN-based motion forecasting. SyReM maintains a compact memory buffer to represent learned knowledge. To ensure memory stability, it employs an inequality constraint that limits increments in the average loss over the memory buffer. Synergistically, a selective memory rehearsal mechanism is designed to enhance learning plasticity by selecting samples from the memory buffer that are most similar to recently observed data. This selection is based on an online-measured cosine similarity of loss gradients, ensuring targeted memory rehearsal. Since replayed samples originate from learned scenarios, this memory rehearsal mechanism avoids compromising memory stability. We validate SyReM under an online CL paradigm where training samples from diverse scenarios arrive as a one-pass stream. Experiments on 11 naturalistic driving datasets from INTERACTION demonstrate that, compared to non-CL and CL baselines, SyReM significantly mitigates catastrophic forgetting in past scenarios while improving forecasting accuracy in new ones. The implementation is publicly available at https://github.com/BIT-Jack/SyReM.
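The selection step described above, choosing memory samples whose loss gradients are most similar to the gradient on recent data, can be sketched as follows. This is a minimal illustration only, not the SyReM implementation (see the linked repository for that): the function name `select_rehearsal_samples`, the flattened per-sample gradients, and the top-k rule are assumptions, and the inequality constraint on the memory loss is not shown.

```python
import numpy as np

def select_rehearsal_samples(memory_grads, new_grad, k):
    """Pick the k memory samples whose gradients best align with new_grad.

    memory_grads: (n_memory, n_params) array, one flattened loss gradient
                  per stored sample (assumed to be computed elsewhere).
    new_grad:     (n_params,) flattened loss gradient on recent data.
    Returns the chosen indices and all cosine similarities.
    """
    memory_grads = np.asarray(memory_grads, dtype=float)
    new_grad = np.asarray(new_grad, dtype=float)
    eps = 1e-12  # guards against division by zero for all-zero gradients
    norms = np.linalg.norm(memory_grads, axis=1) * np.linalg.norm(new_grad)
    sims = (memory_grads @ new_grad) / (norms + eps)
    order = np.argsort(-sims)  # indices sorted by descending similarity
    return order[:k], sims

# Hypothetical two-parameter gradients for four stored samples.
memory_grads = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]])
new_grad = np.array([1.0, 0.2])
chosen, sims = select_rehearsal_samples(memory_grads, new_grad, k=2)
```

Here samples 0 and 3 point in roughly the same direction as the new gradient and are selected, while sample 2, whose gradient opposes it, is ranked last.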
A Simple Baseline for Stable and Plastic Neural Networks
Künzel, Étienne, Jaziri, Achref, Ramesh, Visvanathan
Continual learning in computer vision requires that models adapt to a continuous stream of tasks without forgetting prior knowledge, yet existing approaches often tip the balance heavily toward either plasticity or stability. We introduce RDBP, a simple, low-overhead baseline that unites two complementary mechanisms: ReLUDown, a lightweight activation modification that preserves feature sensitivity while preventing neuron dormancy, and Decreasing Backpropagation, a biologically inspired gradient-scheduling scheme that progressively shields early layers from catastrophic updates. Evaluated on the Continual ImageNet benchmark, RDBP matches or exceeds the plasticity and stability of state-of-the-art methods while reducing computational cost. RDBP thus provides both a practical solution for real-world continual learning and a clear benchmark against which future continual learning strategies can be measured. Continual learning in computer vision tackles the fundamental challenge of enabling models to adapt to a continuous stream of visual information rather than to a single static dataset. Such systems must continuously integrate new concepts while retaining the features and representations learned from previous tasks.
Lifelong Learning with Task-Specific Adaptation: Addressing the Stability-Plasticity Dilemma
Wang, Ruiyu, Wang, Sen, Zuo, Xinxin, Sun, Qiang
Lifelong learning (LL) aims to continuously acquire new knowledge while retaining previously learned knowledge. A central challenge in LL is the stability-plasticity dilemma, which requires models to balance the preservation of previous knowledge (stability) with the ability to learn new tasks (plasticity). While parameter-efficient fine-tuning (PEFT) has been widely adopted in large language models, its application to lifelong learning remains underexplored. To bridge this gap, this paper proposes AdaLL, an adapter-based framework designed to address the dilemma through a simple, universal, and effective strategy. AdaLL co-trains the backbone network and adapters under regularization constraints, enabling the backbone to capture task-invariant features while allowing the adapters to specialize in task-specific information. Unlike methods that freeze the backbone network, AdaLL incrementally enhances the backbone's capabilities across tasks while minimizing interference through backbone regularization. This architectural design significantly improves both stability and plasticity, effectively eliminating the stability-plasticity dilemma. Extensive experiments demonstrate that AdaLL consistently outperforms existing methods across various configurations, including dataset choices, task sequences, and task scales.
Structural features of the fly olfactory circuit mitigate the stability-plasticity dilemma in continual learning
Zou, Heming, Zang, Yunliang, Ji, Xiangyang
These authors contribute equally to this work. Artificial neural networks face the stability-plasticity dilemma in continual learning, while the brain can maintain memories and remain adaptable. However, the biological strategies for continual learning and their potential to inspire learning algorithms in neural networks are poorly understood. This study presents a minimal model of the fly olfactory circuit to investigate the biological strategies that support continual odor learning. We introduce the fly olfactory circuit as a plug-and-play component, termed the Fly Model, which can integrate with modern machine learning methods to address this dilemma. Our findings demonstrate that the Fly Model enhances both memory stability and learning plasticity, overcoming the limitations of current continual learning strategies. We validated its effectiveness across various challenging continual learning scenarios using commonly used datasets. The fly olfactory system serves as an elegant biological circuit for lifelong learning, offering a module that enhances continual learning with minimal additional computational cost for machine learning. When learning new tasks and updating parameters, artificial neural networks inevitably overwrite previously learned patterns, resulting in "catastrophic forgetting" [1-3]. This critical flaw has become the Achilles' heel of neural network models, preventing them from realizing their full potential. Conversely, especially under long non-stationary data streams, the parameters of network models may become less effective at updating, resulting in a gradual decline in their ability to adapt to new information. This issue, known as plasticity loss, has garnered increasing attention in recent years [4-5].
On the Stability-Plasticity Dilemma in Continual Meta-Learning: Theory and Algorithm
We focus on Continual Meta-Learning (CML), which targets accumulating and exploiting meta-knowledge on a sequence of non-i.i.d. tasks. The primary challenge is to strike a balance between stability and plasticity, where a model should be stable to avoid catastrophic forgetting in previous tasks and plastic to learn generalizable concepts from new tasks. To address this, we formulate the CML objective as controlling the average excess risk upper bound of the task sequence, which reflects the trade-off between forgetting and generalization. Based on the objective, we introduce a unified theoretical framework for CML in both static and shifting environments, providing guarantees for various task-specific learning algorithms. Moreover, we first present a rigorous analysis of a bi-level trade-off in shifting environments.
Balancing Stability and Plasticity through Advanced Null Space in Continual Learning
Kong, Yajing, Liu, Liu, Wang, Zhen, Tao, Dacheng
Continual learning is a learning paradigm that learns tasks sequentially under resource constraints, in which the key challenge is the stability-plasticity dilemma, i.e., it is difficult to simultaneously have the stability to prevent catastrophic forgetting of old tasks and the plasticity to learn new tasks well. In this paper, we propose a new continual learning approach, Advanced Null Space (AdNS), to balance stability and plasticity without storing any old data from previous tasks. Specifically, to obtain better stability, AdNS uses low-rank approximation to obtain a novel null space and projects the gradient onto this null space to prevent interference with past tasks. To control the generation of the null space, we introduce a non-uniform constraint strength to further reduce forgetting. Furthermore, we present a simple but effective method, intra-task distillation, to improve performance on the current task. Finally, we show theoretically that the null space plays a key role in both plasticity and stability. Experimental results show that the proposed method achieves better performance than state-of-the-art continual learning approaches.
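The core idea of projecting a gradient onto an approximate null space of old-task features can be sketched as below. This shows only the generic null-space projection technique, not AdNS itself: the non-uniform constraint strength and intra-task distillation are omitted, and the function name `null_space_projector`, the `keep_ratio` threshold, and the toy data are assumptions made for illustration.

```python
import numpy as np

def null_space_projector(features, keep_ratio=0.95):
    """Build a projector onto the approximate null space of old-task features.

    features: (n_samples, d) inputs seen by a linear layer on old tasks.
    keep_ratio: fraction of spectral energy treated as "used" by old tasks
                (a hypothetical hyperparameter for the low-rank cut).
    """
    cov = features.T @ features / features.shape[0]   # uncentered covariance
    U, S, _ = np.linalg.svd(cov)                      # singular values, descending
    energy = np.cumsum(S) / np.sum(S)
    r = int(np.searchsorted(energy, keep_ratio)) + 1  # rank kept for old tasks
    U_null = U[:, r:]                                 # remaining directions
    return U_null @ U_null.T

rng = np.random.default_rng(0)
# Toy old-task features that only occupy the first 2 of 5 input dimensions.
features = np.zeros((100, 5))
features[:, :2] = rng.normal(size=(100, 2))

P = null_space_projector(features, keep_ratio=0.999)
grad = rng.normal(size=5)   # hypothetical gradient for a weight vector
projected = P @ grad        # update direction with old-task components removed
```

Because the projected update is orthogonal to every old-task feature vector, a linear unit with weights `w` produces the same outputs on those inputs after the step `w - lr * projected`, which is the stability argument behind null-space methods.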
Can sea slugs help make AI smarter? - Futurity
For artificial intelligence to get any smarter, it needs first to be as intelligent as one of the simplest creatures in the animal kingdom: the sea slug. Researchers have found that a material can mimic the sea slug's most essential intelligence features. The discovery is a step toward building hardware that could help make AI more efficient and reliable for technology ranging from self-driving cars and surgical robots to social media algorithms. "Through studying sea slugs, neuroscientists discovered the hallmarks of intelligence that are fundamental to any organism's survival," says Shriram Ramanathan, a professor of materials engineering at Purdue University.
Taking lessons from a sea slug, study points to better hardware for artificial intelligence
Researchers mimic the animal kingdom's most basic signs of intelligence in quantum material
WEST LAFAYETTE, Ind. -- For artificial intelligence to get any smarter, it needs first to be as intelligent as one of the simplest creatures in the animal kingdom: the sea slug. A new study has found that a material can mimic the sea slug's most essential intelligence features. The discovery is a step toward building hardware that could help make AI more efficient and reliable for technology ranging from self-driving cars and surgical robots to social media algorithms. The study, publishing this week in the Proceedings of the National Academy of Sciences, was conducted by a team of researchers from Purdue University, Rutgers University, the University of Georgia and Argonne National Laboratory. "Through studying sea slugs, neuroscientists discovered the hallmarks of intelligence that are fundamental to any organism's survival," said Shriram Ramanathan, a Purdue professor of materials engineering.